MULTI-STREAM SPEECH RECOGNITION: READY FOR PRIME TIME?
Submitted to Eurospeech’99, Budapest
Authors
Abstract
Multi-stream and multi-band methods can improve the accuracy of speech recognition systems without greatly increasing their complexity. However, they cannot be applied blindly. In this paper, we review our experience applying multi-stream and multi-band methods to the Broadcast News corpus. We found that multi-stream systems using different acoustic front-ends provide a significant improvement over single-stream systems. However, although multi-band methods have been successful on smaller tasks, we have not yet been able to show any gain from them on this corpus. We report various insights gained from applying these methods to a large-vocabulary task.
Similar resources
Submitted to Eurospeech’99, Budapest: SPEECH/MUSIC DISCRIMINATION BASED ON POSTERIOR PROBABILITY FEATURES
A hybrid connectionist-HMM speech recognizer uses a neural network acoustic classifier. This network estimates the posterior probability that the acoustic feature vectors at the current time step should be labelled as each of around 50 phone classes. We sought to exploit informal observations of the distinctions in this posterior domain between nonspeech audio and speech segments well-modeled b...
Is ASR ready for wireless primetime: Measuring the core technology for selected applications
It is estimated that by the end of 2001 as many as 500 million people worldwide will use cellular services. The nature of hands-busy and eyes-busy situations inherent in the anywhere and anytime wireless communication paradigm presents exciting marketing opportunities and, at the same time, unique technical challenges to the current-generation ASR technology and their new applications. Current ...
Multimedia interaction for the new millennium
Spoken language processing has created value in multiple application areas such as document transcription, data base entry, and command and control. Recently scientists have been focusing on a new class of application that promises on-demand access to multimedia information such as radio and broadcast news. In separate research, augmenting traditional graphical interfaces with additional modali...
Using multiple time scales in a multi-stream speech recognition system
In this paper we propose and investigate a new approach towards using multiple time scale information in automatic speech recognition (ASR) systems. In this framework we are using a particular HMM formalism able to process different input streams and to recombine them at some temporal anchor points. While the phonological level of recombination has to be defined a priori, the optimal temporal ancho...